
Method for the recombinant expression of an N-terminal fragment
of hepatocyte growth factor

The invention relates to a method for the recombinant expression of an N-terminal
four-kringle-containing fragment of hepatocyte growth factor.

Background of the Invention 

Hepatocyte growth factor (HGF/SF) is a polypeptide identified and purified by
Nakamura, T., et al., Biochem. Biophys. Res. Commun. 122 (1984) 1450-1459. It was
further found that hepatocyte growth factor is identical to scatter factor (SF),
Weidner, K.M., et al., Proc. Natl. Acad. Sci. USA 88 (1991) 7001-7005. HGF is a
glycoprotein involved in the development of a number of cellular phenotypes
including proliferation, mitogenesis, formation of branching tubules and, in the
case of tumor cells, invasion and metastasis. For a status review, see Stuart, K.A., et
al., Int. J. Exp. Pathol. 81 (2000) 17-30.

Both rat HGF and human HGF have been sequenced and cloned (Miyazawa, K., et
al., Biochem. Biophys. Res. Commun. 163 (1989) 967-973; Nakamura, T., et al.,
Nature 342 (1989) 440-443; Seki, T., et al., Biochem. Biophys. Res. Commun. 172
(1990) 321-327; Tashiro, K., et al., Proc. Natl. Acad. Sci. USA 87 (1990) 3200-3204;
Okajima, A., et al., Eur. J. Biochem. 193 (1990) 375-381).


HGF is a protein with high similarity to human plasminogen (38% amino acid
sequence identity). HGF and plasminogen are both synthesized as a single-chain
polypeptide which is proteolytically processed to a disulfide-linked heterodimer.
HGF contains an N-terminal domain, four consecutive kringle domains and a
carboxy-terminal protease-like domain. Different truncated HGF variants have been
described. NK1 is the shortest HGF variant described. NK1 contains amino acids
32-210 and is truncated after the first kringle domain (Lokker, N.A., and Godowski,
P.J., J. Biol. Chem. 268 (1993) 17145-17150). NK2 consists of the N-terminal
domain, kringle 1 and kringle 2 and is the naturally occurring product of
an alternatively spliced HGF mRNA (Chan, A.M., et al., Science 254 (1991) 1382-
1385). Further HGF variants containing parts of the heavy chain of HGF (amino
acids 1-494, containing the alpha-subunit of HGF from amino acids 1-463) are
described by Lokker, N.A., EMBO J. 11 (1992) 2503-2510.

It was further found that an HGF/SF fragment, termed NK4, consisting of the
N-terminal hairpin domain and the four kringle domains of HGF/SF has
pharmacological properties that are completely different from those of HGF/SF.
NK4 is an antagonist to the influence of HGF/SF on the motility and the invasion of
colon cancer cells and is, in addition, an angiogenesis inhibitor that suppresses
tumor growth and metastasis (Parr, C., et al., Int. J. Cancer 85 (2000) 563-570;
Kuba, K., et al., Cancer Res. 60 (2000) 6737-6743; Date, K., et al., FEBS Lett. 420
(1997) 1-6; Date, K., et al., Oncogene 17 (1998) 3045-3054).

NK4 is prepared according to the state of the art (Date, K., et al., FEBS Lett. 420
(1997) 1-6) by recombinant expression of HGF cDNA in CHO cells and subsequent
digestion with pancreatic elastase. Two other isoforms of HGF (NK1 and NK2),
comprising the N-terminal domain and kringle 1, and the N-terminal domain and
kringles 1 and 2, respectively, were produced in E. coli (Stahl, S.J., Biochem. J. 326
(1997) 763-772). However, this method yields HGF-derived proteins amounting to
only about 10-20% of the total protein.

Summary of the Invention

The invention provides a method for the production of the alpha-chain of HGF or
a fragment thereof (NK polypeptide) by expression of a nucleic acid encoding said
NK polypeptide in a microbial host cell, isolation of inclusion bodies containing
said NK polypeptide in denatured form, solubilization of the inclusion bodies and
naturation of the denatured NK polypeptide, characterized in that in said nucleic
acid at least one of the codons of amino acids selected from the group consisting
of codons at positions 33, 35 and 36 is CGT.

Amino acid (aa) and codon numbering is according to the sequence shown in
Swiss-Prot P14210, wherein aa 1-31 denotes the signal sequence, aa 32-494
denotes the alpha chain, aa 128-206 kringle 1, aa 211-288 kringle 2, aa 305-383
kringle 3 and aa 391-469 kringle 4.
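
The numbering convention above can be illustrated with a small Python sketch. It is only an illustration, not part of the claimed method: the function names `to_alpha_chain_index` and `domains_at` are hypothetical, and the boundaries are simply those quoted in the preceding paragraph.

```python
# Domain boundaries as quoted in the text (Swiss-Prot P14210 numbering).
SIGNAL_LENGTH = 31
DOMAINS = {
    "signal sequence": (1, 31),
    "alpha chain": (32, 494),
    "kringle 1": (128, 206),
    "kringle 2": (211, 288),
    "kringle 3": (305, 383),
    "kringle 4": (391, 469),
}


def to_alpha_chain_index(position):
    """Convert a precursor position into a 1-based index within the alpha chain.

    Hypothetical helper: it only subtracts the 31-residue signal sequence.
    """
    if position <= SIGNAL_LENGTH:
        raise ValueError("position lies within the signal sequence")
    return position - SIGNAL_LENGTH


def domains_at(position):
    """Return the names of all listed regions that contain the position."""
    return [name for name, (start, end) in DOMAINS.items()
            if start <= position <= end]


if __name__ == "__main__":
    for pos in (32, 33, 130, 478):
        print(pos, to_alpha_chain_index(pos), domains_at(pos))
```

With this convention, aa 32 is the first residue of the alpha chain and aa 494 the last of its 463 residues.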

Surprisingly, it was found that modification of at least one of the codons of the
DNA sequence at positions 33, 35 and 36 (codon 33 and 36 encode arginine,
numbering according to M73239) results in an increase of the expression yield of
about 100% or more. It is further preferred that the codon for amino acid 32 is
changed from encoding Gln to encoding Ser in order to improve splitting off of the
N-terminal methionine.



NK polypeptides according to the invention consist of aa 32-494 or an N-terminal
fragment thereof (always beginning with aa 32), preferably fragment aa 32-478, the
smallest fragment being aa 32-207. All NK polypeptides according to the invention
show activity in a scatter assay according to Example 4.

The invention further provides a nucleic acid encoding an NK polypeptide
consisting of aa 32-494 or an N-terminal fragment thereof, beginning with aa 32,
preferably fragments aa 32-x, wherein x is a number between 207 and 478, and x is
preferably 207 or 478, characterized in that at least one of the codons of amino
acids selected from the group consisting of codons at positions 33, 35 and 36 is
CGT. Preferably, all codons at positions 33, 35 and 36 are CGT.

In a preferred embodiment of the invention, aa 32 is changed from glutamine to
serine to improve homogeneity of the protein (cleavage of the N-terminal
methionine).

It is further preferred to introduce two translational stop codons (TAA, TAG
and/or TGA) at the end of the nucleic acid encoding the NK polypeptide in
order to stop the translation at a position equivalent to the end of the desired
polypeptide.

Detailed Description of the Invention

Human HGF is a disulfide-linked heterodimer, which can be cleaved into an α-
subunit of 463 amino acids and a β-subunit of 234 amino acids by cleavage
between amino acids R494 and V495. The N-terminus of the α-chain is preceded
by 31 amino acids starting with a methionine residue. This segment includes a signal
sequence of 31 amino acids. The α-chain starts at amino acid 32 and contains four
kringle domains. The so-called "hairpin domain" consists of amino acids 70-96. The
kringle 1 domain consists of amino acids 128-206. The kringle 2 domain consists of
amino acids 211-288, the kringle 3 domain consists of amino acids 305-383, and the
kringle 4 domain consists of amino acids 391-469 of the α-chain, approximately.
There exist variations of these sequences, essentially not affecting the biological
properties of NK polypeptides (especially not affecting their activities antagonistic to
HGF and their antiangiogenic activities), which variations are described, for example,
in WO 93/23541. Also, the length of NK polypeptides can vary within a few amino
acids as long as their biological properties are not affected.

NK1 consists of aa 32 to 206-210 of the HGF/SF α-chain, NK2 consists of aa 32 to
288-305 and NK4 is composed of aa 32 to 447 (resp. 469-494). Further NK
polypeptides which are encoded by the nucleic acids according to the invention and
which can be produced recombinantly according to the invention are described in
WO 93/23541 and are e.g. 32-207, 32-303, or 32-384. NK polypeptides have the in
vivo biological activity of causing inhibition of tumor growth, angiogenesis and/or
metastasis.
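
As a quick arithmetic aid for the fragment ranges named in this description, the following Python sketch computes the number of residues in a fragment aa 32-x. The helper name `fragment_length` is hypothetical; the end positions are the ones mentioned in the text, and the count excludes any added N-terminal methionine.

```python
ALPHA_CHAIN_START = 32  # first residue of every NK polypeptide per the text


def fragment_length(last_residue, first_residue=ALPHA_CHAIN_START):
    """Number of residues in the inclusive range first_residue..last_residue."""
    if last_residue < first_residue:
        raise ValueError("fragment end precedes its start")
    return last_residue - first_residue + 1


# End positions mentioned in the description (207, 303, 384, 478, 494).
FRAGMENT_ENDS = [207, 303, 384, 478, 494]
lengths = {f"32-{end}": fragment_length(end) for end in FRAGMENT_ENDS}

if __name__ == "__main__":
    for name, length in lengths.items():
        print(f"aa {name}: {length} residues")
```

The full alpha chain aa 32-494 comes out at 463 residues, which agrees with the α-subunit length stated above.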

The NK polypeptides can be produced by recombinant means in prokaryotes. For
expression in prokaryotic host cells, the nucleic acid is integrated into a suitable
expression vector, according to methods familiar to a person skilled in the art. Such
an expression vector preferably contains a regulatable/inducible promoter. The
recombinant vector is then introduced for the expression into a suitable host cell
such as, e.g., E. coli, and the transformed cell is cultured under conditions which
allow expression of the heterologous gene. After fermentation, inclusion bodies
containing denatured NK polypeptide are isolated.

Escherichia, Salmonella, Streptomyces or Bacillus are, for example, suitable as
prokaryotic host organisms. For the production of NK polypeptides, prokaryotes are
transformed in the usual manner with the vector which contains the DNA
according to the invention and encoding an NK polypeptide and subsequently
fermented in the usual manner. However, the expression yield in E. coli using the
original DNA sequence of an NK polypeptide (GenBank M73239) is very low.

Inclusion bodies are found in the cytoplasm as the gene to be expressed does not
contain a signal sequence. These inclusion bodies are separated from other cell
components, for example by centrifugation after cell lysis.

The inclusion bodies are solubilized by adding a denaturing agent like 6 M
guanidinium hydrochloride or 8 M urea at pH 7-9 in phosphate buffer (preferably
in a concentration of 0.1 - 1.0 M, e.g. 0.4 M), preferably in the presence of DTT
(dithio-1,4-threitol). The solubilisate is diluted in phosphate buffer pH 7-9 in the
presence of GSH/GSSG (glutathione, preferably 2-20 mM) and a denaturing agent in
a non-denaturing concentration (e.g. 2 M guanidinium hydrochloride or 4 M urea)
or, preferably, instead of guanidinium hydrochloride or urea, arginine in a
concentration of about 0.3 to 1.0 M, preferably in a concentration of about 0.7 M.
Renaturation is performed preferably at a temperature of about 4°C and for about
48 to 160 hours.

According to the state of the art, the use of Tris buffer during solubilization and
naturation leads to a considerable amount (of about 50%) of side-products which
are identified by the inventors as consisting mainly of GSH-modified NK
polypeptides. To the contrary, it was surprisingly found that the use of potassium
phosphate buffer in a pH range between 7 and 9, preferably between pH 8 and 9,
leads to a considerable improvement in yield and purity of NK polypeptides.

After naturation is terminated, the solution is dialyzed, preferably against
phosphate buffer pH 7-9 (preferably in a concentration of 0.1 - 1.0 M, e.g. 0.3 M),
for at least 24 hours, preferably for 24 - 120 hours.

NK polypeptides can be purified after recombinant production and naturation of
the water-insoluble denatured polypeptide (inclusion bodies) according to the
method of the invention, preferably by chromatographic methods, e.g. by affinity
chromatography, hydrophobic interaction chromatography, immunoprecipitation,
gel filtration, ion exchange chromatography, chromatofocussing, isoelectric
focussing, selective precipitation, electrophoresis, or the like. It is preferred to
purify NK polypeptides by hydrophobic interaction chromatography, preferably at
pH 7-9, in the presence of phosphate buffer and/or by the use of butyl- or phenyl-
sepharose.

The following examples, references, figure and sequence listing are provided to aid 
the understanding of the present invention, the true scope of which is set forth in 
the appended claims. It is understood that modifications can be made in the 
procedures set forth without departing from the spirit of the invention. 




Description of the Figure:

Figure 1: SDS gel (10% NuPAGE-SDS, 5 µl per lane, numbering from left
to right) of NK4 protein in biomass and isolated inclusion bodies
(IB).

lane 1: standard
lane 2: biomass
lane 3: supernatant after centrifugation
lane 4: supernatant after further centrifugation
lane 5: IB preparation
lane 6: IB preparation after wash

Description of the Sequences:

SEQ ID NO:1 Amino acid sequence and DNA sequence encoding the α-chain of
HGF, original sequence according to GenBank M73239 (without
signal sequence)

SEQ ID NO:2 Protein sequence of the α-chain of HGF

SEQ ID NO:3 Amino acid sequence and DNA sequence encoding NK4
according to the invention (amino acid sequence including N-
terminal methionine, DNA sequence including two stop codons)

SEQ ID NO:4 Protein sequence of NK4
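
The lengths declared in the sequence listing can be cross-checked against these descriptions with a short Python sketch. The constants are the `<211>` values from the listing (1389 and 1350 nucleotides); the helper name `codon_count` is hypothetical.

```python
SEQ1_NUCLEOTIDES = 1389   # <211> of SEQ ID NO:1 (alpha chain, no signal sequence)
SEQ3_NUCLEOTIDES = 1350   # <211> of SEQ ID NO:3 (NK4, including two stop codons)
STOP_CODONS_IN_SEQ3 = 2


def codon_count(nucleotides):
    """Number of complete codons; rejects lengths that are not a multiple of 3."""
    if nucleotides % 3 != 0:
        raise ValueError("length is not a whole number of codons")
    return nucleotides // 3


# Residues encoded by SEQ ID NO:1 (expected to match <211> of SEQ ID NO:2).
alpha_chain_residues = codon_count(SEQ1_NUCLEOTIDES)

# Residues encoded by SEQ ID NO:3 once the stop codons are excluded
# (expected to match <211> of SEQ ID NO:4, which includes the N-terminal Met).
nk4_residues = codon_count(SEQ3_NUCLEOTIDES) - STOP_CODONS_IN_SEQ3

if __name__ == "__main__":
    print(alpha_chain_residues, nk4_residues)
```

The NK4 count is one more than the 447 residues of aa 32-478 because of the N-terminal methionine.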

Example 1

Recombinant expression of NK polypeptides

The NK4 polypeptide consisting of amino acid positions 32 to 478 of HGF was used
for cloning and recombinant expression in Escherichia coli. The original DNA
sequence used as source of DNA has been described (database identifier "gb:M73239").
PCR was performed in order to amplify and concurrently modify the DNA coding
for NK4 (SEQ ID NO:1). All methods were performed under standard conditions.



In comparison to the original DNA sequence of NK4, the following modifications
were introduced:

- elimination of the eukaryotic signal peptide sequence and fusion of the ATG
  start codon next to amino acid position 32 of NK4
- exchange of amino acid position 32 (position 2 in SEQ ID NO:4) from Gln to
  Ser in order to improve homogeneity of the protein product (Met-free)
- modification of the DNA sequence of the codons of amino acids at position 33
  (AGG to CGT), 35 (AGA to CGT), and 36 (AGA to CGT) in order to improve
  gene expression in E. coli
- modification of the DNA sequence of codons at position 477 (ATA to ATC)
  and 478 (GTC to GTT) in order to facilitate insertion of the PCR product into the
  vector
- introduction of two translational stop codons at positions 479 (TAA) and 480
  (TAG), in order to stop the translation at a position equivalent to the end of
  the NK4 protein domain
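
The N-terminal codon changes listed above can be sketched in Python as follows. This is an illustration only and not the PCR procedure itself: the original codons for positions 32-36 are read from SEQ ID NO:1, the Ser codon `tct` is read from SEQ ID NO:3, and the helper names `apply_changes` and `assemble` are hypothetical.

```python
# Codons of aa 32-36 in the original sequence (SEQ ID NO:1, HGF numbering).
ORIGINAL_CODONS = {32: "caa", 33: "agg", 34: "aaa", 35: "aga", 36: "aga"}

# Changes described in Example 1: Gln32 -> Ser, and CGT at positions 33, 35, 36.
# The Ser codon "tct" is taken from SEQ ID NO:3.
CODON_CHANGES = {32: "tct", 33: "cgt", 35: "cgt", 36: "cgt"}

START_CODON = "atg"


def apply_changes(codons, changes):
    """Return a copy of the codon table with the listed positions replaced."""
    modified = dict(codons)
    modified.update(changes)
    return modified


def assemble(codons):
    """Prefix the start codon and join the codons in order of position."""
    ordered = [codons[position] for position in sorted(codons)]
    return " ".join([START_CODON] + ordered)


modified_start = assemble(apply_changes(ORIGINAL_CODONS, CODON_CHANGES))

if __name__ == "__main__":
    print(modified_start)
```

The assembled string corresponds to the first six codons of SEQ ID NO:3; codon 34 (Lys, aaa) is left unchanged.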

The PCR-amplified DNA fragment was treated with restriction endonucleases NdeI
and BanII and was ligated to the modified pQE vector (Qiagen) (elimination of
His-tag as well as DHFR coding region), which was appropriately treated with NdeI
and BanII. The elements of expression plasmid pQE-NK4-iSer (plasmid size 4447
bp) are T5 promoter/lac operator element, NK4 coding region, lambda t0
transcriptional termination region, rrnB T1 transcriptional termination region,
ColE1 origin of replication and β-lactamase coding sequence.

The ligation reaction was used to transform E. coli competent cells, e.g. E. coli
strain C600 harbouring expression helper plasmid pUBS520 (EP 0 373 365). E. coli
colonies were isolated and were characterized with respect to restriction and
sequence analysis of their plasmids. The selection of clones was done by analysis of
the NK4 protein content after cultivation of recombinant cells in LB medium in the
presence of appropriate antibiotics and after induction of the gene expression by
addition of IPTG (1 mM). The protein patterns of cell lysates were compared by
PAGE. The recombinant E. coli clone showing the highest proportion of NK4
protein was selected for the production process. Fermentation was performed
under standard conditions and inclusion bodies were isolated. Yield: 130 g/l net
weight of cells with 30%-40% NK4 of total protein.



NK1 and NK2 can be produced recombinantly in an analogous manner.

Example 2

Solubilization and naturation 

Inclusion bodies were dissolved overnight in a buffer containing 6 M guanidinium
hydrochloride, 0.1 M potassium phosphate pH 8.3 (by titration with 10 M KOH), 1
mM EDTA, 0.01 mM DTT. The concentration of the dissolved protein was
determined by Biuret assay and finally adjusted to a concentration of 25 mg total
protein/ml at room temperature.

This NK-solubilisate was diluted to a concentration of 0.4 mg/ml in a buffer
containing 0.7 M arginine, 0.1 M potassium phosphate pH 8.5 (by titration with
conc. HCl), 10 mM GSH, 5 mM GSSG and 1 mM EDTA. This renaturation assay
was incubated between 2 and 8 days at 4°C. After obtaining the maximal
renaturation efficacy, the renaturation assay of 15 l volume was concentrated to 3 l
using a tangential flow filtration unit (MW cut-off 10 kDa, Sartorius). It was
subsequently dialyzed against 3 times 50 l buffer containing 0.3 M potassium
phosphate at pH 8.0 for at least 3 x 24 hours, optimally for 5 days in total.
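
The dilution step of this example can be followed with a little arithmetic, sketched here in Python. The sketch assumes that the 15 l refers to the final volume of the renaturation assay and that guanidinium hydrochloride is carried over only from the solubilisate; both are assumptions made for illustration, not statements from the example.

```python
SOLUBILISATE_MG_PER_ML = 25.0   # protein concentration of the solubilisate
TARGET_MG_PER_ML = 0.4          # protein concentration in the renaturation assay
RENATURATION_VOLUME_L = 15.0    # assumed to be the final assay volume
CONCENTRATED_VOLUME_L = 3.0     # volume after tangential flow filtration
GUANIDINIUM_MOLAR = 6.0         # guanidinium hydrochloride in the solubilisate

# Fold dilution needed to go from 25 mg/ml to 0.4 mg/ml.
dilution_factor = SOLUBILISATE_MG_PER_ML / TARGET_MG_PER_ML

# Volume of solubilisate required for the assumed 15 l assay.
solubilisate_volume_l = RENATURATION_VOLUME_L / dilution_factor

# Guanidinium carried over into the assay (assumption: no other source).
residual_guanidinium_molar = GUANIDINIUM_MOLAR / dilution_factor

# Total protein in the assay: mg/ml equals g/l, so g/l * l gives grams.
total_protein_g = TARGET_MG_PER_ML * RENATURATION_VOLUME_L

# Fold concentration achieved by reducing 15 l to 3 l.
concentration_factor = RENATURATION_VOLUME_L / CONCENTRATED_VOLUME_L

if __name__ == "__main__":
    print(f"dilution factor: {dilution_factor:.1f}-fold")
    print(f"solubilisate volume: {solubilisate_volume_l:.2f} l")
    print(f"residual guanidinium: {residual_guanidinium_molar:.3f} M")
    print(f"total protein: {total_protein_g:.1f} g")
    print(f"concentration step: {concentration_factor:.1f}-fold")
```

Under these assumptions the carried-over guanidinium hydrochloride is roughly 0.1 M, well below the 2 M given in the description as a non-denaturing concentration.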

Example 3
Purification

Purification was performed by Heparin-Sepharose chromatography.

Buffer conditions:

Buffer A: 50 mM Tris pH 8.0
Buffer B: 50 mM Tris pH 8.0, 2 M NaCl

Gradient: 5-25% buffer B, 2 column volumes
          25-55% buffer B, 16 column volumes
          55-100% buffer B, 0.7 column volumes
          100% buffer B, 2 column volumes
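
To make the gradient easier to read in terms of salt, the following Python sketch converts percent buffer B into NaCl concentration. It assumes ideal linear mixing of buffers A and B and linear gradient segments; the names `SEGMENTS`, `percent_b_at` and `nacl_molar` are hypothetical.

```python
NACL_IN_BUFFER_B_MOLAR = 2.0  # buffer A contains no NaCl

# (start % B, end % B, length in column volumes), as listed in the gradient.
SEGMENTS = [
    (5.0, 25.0, 2.0),
    (25.0, 55.0, 16.0),
    (55.0, 100.0, 0.7),
    (100.0, 100.0, 2.0),
]


def nacl_molar(percent_b):
    """NaCl concentration for a given percentage of buffer B (ideal mixing)."""
    return NACL_IN_BUFFER_B_MOLAR * percent_b / 100.0


def percent_b_at(column_volumes):
    """Percent buffer B after the given number of column volumes."""
    if column_volumes < 0:
        raise ValueError("column volumes must not be negative")
    elapsed = 0.0
    for start, end, length in SEGMENTS:
        if column_volumes <= elapsed + length:
            fraction = (column_volumes - elapsed) / length
            return start + fraction * (end - start)
        elapsed += length
    raise ValueError("beyond the end of the gradient")


total_column_volumes = sum(length for _, _, length in SEGMENTS)

if __name__ == "__main__":
    for cv in (0.0, 2.0, 10.0, 18.0):
        pct = percent_b_at(cv)
        print(f"{cv:5.1f} CV: {pct:5.1f}% B, {nacl_molar(pct):.2f} M NaCl")
```

Under these assumptions the long 16-column-volume segment spans about 0.5 M to 1.1 M NaCl.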

To the eluted material, 1 M ammonium sulfate in 0.1 M potassium phosphate pH
8.0 was added and the mixture was incubated at 4°C overnight. The sample was
centrifuged and the supernatant was loaded on a Phenyl Sepharose column (150 ml).
The column was washed with 1 column volume of 1 M ammonium sulfate, 50 mM
potassium phosphate pH 8.0.

Elution conditions:

Buffer A: 1 M ammonium sulfate, 50 mM potassium phosphate pH 8.0
Buffer B: 50 mM potassium phosphate pH 8.0, 40% ethylene glycol
0-100% buffer B, 20 column volumes

Example 4
Determination of activity

a) Scatter assay

MDCK cells were subconfluently grown in tissue culture plates. Cells were treated
with HGF (10 ng/ml) or with combinations of HGF and NK4. In these experiments
the HGF-induced cell scattering was inhibited by at least 90% by the addition of a
10- to 1000-fold molar excess of NK4, showing the functional activity.

b) Proliferation assay

Inhibition of the mitogenic activity of HGF by NK4 was determined by measuring
DNA synthesis of adult rat hepatocytes in primary culture as described in
Nakamura et al. (1989). In these experiments the HGF-induced cell proliferation
was inhibited by at least 90% by the addition of a 10- to 1000-fold molar excess of
NK4, showing the functional activity.

c) Invasion assay

In this assay the invasive potential of tumor cells is analyzed. The assay was done
essentially as described in Albini, A., et al., Cancer Res. 47 (1987) 3239-3245, using
HT115 cells. Again, HGF-induced (10 ng/ml) cell invasion could be inhibited by at
least 90% by a 10- to 1000-fold molar excess of NK4, showing the functional
activity.




Example 5
Activity in vivo

Model: Lewis Lung Carcinoma nude mouse tumor model
1 X 10^ Lewis Lung Carcinoma cells were s.c. implanted into male
nude mice (BALB/c nu/nu).

Treatment: After 4 days, one application daily of pegylated NK4 over a period of
2-4 weeks

Dose: 1000 µg/mouse/day
      300 µg/mouse/day
      100 µg/mouse/day
      placebo

Result: Treatment with NK4 shows a dose-dependent suppression of
primary tumor growth and metastasis, whereas no effect is seen in
placebo-treated groups.

Patent Claims

1. A nucleic acid encoding the α-chain of hepatocyte growth factor or an N-
terminal fragment thereof, characterized in that in said nucleic acid at least
one of the codons of amino acids selected from the group consisting of
codons at positions 33, 35 and 36 is CGT.

2. A nucleic acid according to claim 1, characterized in that the codons of amino
acids at positions 33, 35 and 36 are CGT.

3. Method for the production of the α-chain of hepatocyte growth factor or an N-
terminal fragment thereof (NK polypeptide) by expression of a nucleic acid
encoding said NK polypeptide in a microbial host cell, isolation of inclusion
bodies containing said NK polypeptide in denatured form, solubilization of
the inclusion bodies and naturation of the denatured NK polypeptide,
characterized in that in said nucleic acid at least one of the codons of amino
acids selected from the group consisting of codons at positions 33, 35 and 36
is CGT.

4. Method according to claim 3, characterized in that the codons of amino acids
at positions 33, 35 and 36 are CGT.

Abstract

A method for the production of the α-chain of hepatocyte growth factor or an N-
terminal fragment thereof (NK polypeptide) by expression of a nucleic acid
encoding said NK polypeptide in a microbial host cell, isolation of inclusion bodies
containing said NK polypeptide in denatured form, solubilization of the inclusion
bodies and naturation of the denatured NK polypeptide, which is characterized in
that in said nucleic acid at least one of the codons of amino acids selected from the
group consisting of codons at positions 33, 35 and 36 is CGT, results in an
improved expression yield.

SEQUENCE LISTING

<110> F. Hoffmann-La Roche AG

<120> Method for the recombinant expression of an N-terminal fragment
      of hepatocyte growth factor

<130> 22388

<160> 4

<170> PatentIn version 3.2

<210> 1
<211> 1389
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> (1)..(1389)
<223> DNA sequence encoding the alpha-chain of hepatocyte growth factor
      (HGF)

<400> 1

caa agg aaa aga aga aat aca att cat gaa ttc aaa aaa tca gca aag  48
Gln Arg Lys Arg Arg Asn Thr Ile His Glu Phe Lys Lys Ser Ala Lys
1               5                   10                  15

act acc cta atc aaa ata gat cca gca ctg aag ata aaa acc aaa aaa  96
Thr Thr Leu Ile Lys Ile Asp Pro Ala Leu Lys Ile Lys Thr Lys Lys
            20                  25                  30

gtg aat act gca gac caa tgt gct aat aga tgt act agg aat aaa gga  144
Val Asn Thr Ala Asp Gln Cys Ala Asn Arg Cys Thr Arg Asn Lys Gly
        35                  40                  45

ctt cca ttc act tgc aag gct ttt gtt ttt gat aaa gca aga aaa caa  192
Leu Pro Phe Thr Cys Lys Ala Phe Val Phe Asp Lys Ala Arg Lys Gln
    50                  55                  60

tgc ctc tgg ttc ccc ttc aat agc atg tca agt gga gtg aaa aaa gaa  240
Cys Leu Trp Phe Pro Phe Asn Ser Met Ser Ser Gly Val Lys Lys Glu
65                  70                  75                  80

ttt ggc cat gaa ttt gac ctc tat gaa aac aaa gac tac att aga aac  288
Phe Gly His Glu Phe Asp Leu Tyr Glu Asn Lys Asp Tyr Ile Arg Asn
                85                  90                  95

tgc atc att ggt aaa gga cgc agc tac aag gga aca gta tct atc act  336
Cys Ile Ile Gly Lys Gly Arg Ser Tyr Lys Gly Thr Val Ser Ile Thr
            100                 105                 110

aag agt ggc atc aaa tgt cag ccc tgg agt tcc atg ata cca cac gaa  384
Lys Ser Gly Ile Lys Cys Gln Pro Trp Ser Ser Met Ile Pro His Glu
        115                 120                 125

cac agc ttt ttg cct tcg agc tat cgg ggt aaa gac cta cag gaa aac  432
His Ser Phe Leu Pro Ser Ser Tyr Arg Gly Lys Asp Leu Gln Glu Asn
    130                 135                 140

tac tgt cga aat cct cga ggg gaa gaa ggg gga ccc tgg tgt ttc aca  480
Tyr Cys Arg Asn Pro Arg Gly Glu Glu Gly Gly Pro Trp Cys Phe Thr
145                 150                 155                 160

agc aat cca gag gta cgc tac gaa gtc tgt gac att cct cag tgt tca  528
Ser Asn Pro Glu Val Arg Tyr Glu Val Cys Asp Ile Pro Gln Cys Ser
                165                 170                 175

gaa gtt gaa tgc atg acc tgc aat ggg gag agt tat cga ggt ctc atg  576
Glu Val Glu Cys Met Thr Cys Asn Gly Glu Ser Tyr Arg Gly Leu Met
            180                 185                 190

gat cat aca gaa tca ggc aag att tgt cag cgc tgg gat cat cag aca  624
Asp His Thr Glu Ser Gly Lys Ile Cys Gln Arg Trp Asp His Gln Thr
        195                 200                 205

cca cac cgg cac aaa ttc ttg cct gaa aga tat ccc gac aag ggc ttt  672
Pro His Arg His Lys Phe Leu Pro Glu Arg Tyr Pro Asp Lys Gly Phe
    210                 215                 220

gat gat aat tat tgc cgc aat ccc gat ggc cag ccg agg cca tgg tgc  720
Asp Asp Asn Tyr Cys Arg Asn Pro Asp Gly Gln Pro Arg Pro Trp Cys
225                 230                 235                 240

tat act ctt gac cct cac acc cgc tgg gag tac tgt gca att aaa aca  768
Tyr Thr Leu Asp Pro His Thr Arg Trp Glu Tyr Cys Ala Ile Lys Thr
                245                 250                 255

tgc gct gac aat act atg aat gac act gat gtt cct ttg gaa aca act  816
Cys Ala Asp Asn Thr Met Asn Asp Thr Asp Val Pro Leu Glu Thr Thr
            260                 265                 270

gaa tgc atc caa ggt caa gga gaa ggc tac agg ggc act gtc aat acc  864
Glu Cys Ile Gln Gly Gln Gly Glu Gly Tyr Arg Gly Thr Val Asn Thr
        275                 280                 285

att tgg aat gga att cca tgt cag cgt tgg gat tct cag tat cct cac  912
Ile Trp Asn Gly Ile Pro Cys Gln Arg Trp Asp Ser Gln Tyr Pro His
    290                 295                 300

gag cat gac atg act cct gaa aat ttc aag tgc aag gac cta cga gaa  960
Glu His Asp Met Thr Pro Glu Asn Phe Lys Cys Lys Asp Leu Arg Glu
305                 310                 315                 320

aat tac tgc cga aat cca gat ggg tct gaa tca ccc tgg tgt ttt acc  1008
Asn Tyr Cys Arg Asn Pro Asp Gly Ser Glu Ser Pro Trp Cys Phe Thr
                325                 330                 335

act gat cca aac atc cga gtt ggc tac tgc tcc caa att cca aac tgt  1056
Thr Asp Pro Asn Ile Arg Val Gly Tyr Cys Ser Gln Ile Pro Asn Cys
            340                 345                 350

gat atg tca cat gga caa gat tgt tat cgt ggg aat ggc aaa aat tat  1104
Asp Met Ser His Gly Gln Asp Cys Tyr Arg Gly Asn Gly Lys Asn Tyr
        355                 360                 365

atg ggc aac tta tcc caa aca aga tct gga cta aca tgt tca atg tgg  1152
Met Gly Asn Leu Ser Gln Thr Arg Ser Gly Leu Thr Cys Ser Met Trp
    370                 375                 380

gac aag aac atg gaa gac tta cat cgt cat atc ttc tgg gaa cca gat  1200
Asp Lys Asn Met Glu Asp Leu His Arg His Ile Phe Trp Glu Pro Asp
385                 390                 395                 400

gca agt aag ctg aat gag aat tac tgc cga aat cca gat gat gat gct  1248
Ala Ser Lys Leu Asn Glu Asn Tyr Cys Arg Asn Pro Asp Asp Asp Ala
                405                 410                 415

cat gga ccc tgg tgc tac acg gga aat cca ctc att cct tgg gat tat  1296
His Gly Pro Trp Cys Tyr Thr Gly Asn Pro Leu Ile Pro Trp Asp Tyr
            420                 425                 430

tgc cct att tct cgt tgt gaa ggt gat acc aca cct aca ata gtc aat  1344
Cys Pro Ile Ser Arg Cys Glu Gly Asp Thr Thr Pro Thr Ile Val Asn
        435                 440                 445

tta gac cat ccc gta ata tct tgt gcc aaa acg aaa caa ttg cga  1389
Leu Asp His Pro Val Ile Ser Cys Ala Lys Thr Lys Gln Leu Arg
    450                 455                 460


<210> 2 

<211> 463 

<212> PRT 

<213> Homo sapiens 

<400> 2 

Gln Arg Lys Arg Arg Asn Thr Ile His Glu Phe Lys Lys Ser Ala Lys
1               5                   10                  15

Thr Thr Leu Ile Lys Ile Asp Pro Ala Leu Lys Ile Lys Thr Lys Lys
            20                  25                  30

Val Asn Thr Ala Asp Gln Cys Ala Asn Arg Cys Thr Arg Asn Lys Gly
        35                  40                  45

Leu Pro Phe Thr Cys Lys Ala Phe Val Phe Asp Lys Ala Arg Lys Gln
    50                  55                  60

Cys Leu Trp Phe Pro Phe Asn Ser Met Ser Ser Gly Val Lys Lys Glu
65                  70                  75                  80

Phe Gly His Glu Phe Asp Leu Tyr Glu Asn Lys Asp Tyr Ile Arg Asn
                85                  90                  95

Cys Ile Ile Gly Lys Gly Arg Ser Tyr Lys Gly Thr Val Ser Ile Thr
            100                 105                 110

Lys Ser Gly Ile Lys Cys Gln Pro Trp Ser Ser Met Ile Pro His Glu
        115                 120                 125

His Ser Phe Leu Pro Ser Ser Tyr Arg Gly Lys Asp Leu Gln Glu Asn
    130                 135                 140

Tyr Cys Arg Asn Pro Arg Gly Glu Glu Gly Gly Pro Trp Cys Phe Thr
145                 150                 155                 160

Ser Asn Pro Glu Val Arg Tyr Glu Val Cys Asp Ile Pro Gln Cys Ser
                165                 170                 175

Glu Val Glu Cys Met Thr Cys Asn Gly Glu Ser Tyr Arg Gly Leu Met
            180                 185                 190

Asp His Thr Glu Ser Gly Lys Ile Cys Gln Arg Trp Asp His Gln Thr
        195                 200                 205

Pro His Arg His Lys Phe Leu Pro Glu Arg Tyr Pro Asp Lys Gly Phe
    210                 215                 220

Asp Asp Asn Tyr Cys Arg Asn Pro Asp Gly Gln Pro Arg Pro Trp Cys
225                 230                 235                 240

Tyr Thr Leu Asp Pro His Thr Arg Trp Glu Tyr Cys Ala Ile Lys Thr
                245                 250                 255

Cys Ala Asp Asn Thr Met Asn Asp Thr Asp Val Pro Leu Glu Thr Thr
            260                 265                 270

Glu Cys Ile Gln Gly Gln Gly Glu Gly Tyr Arg Gly Thr Val Asn Thr
        275                 280                 285

Ile Trp Asn Gly Ile Pro Cys Gln Arg Trp Asp Ser Gln Tyr Pro His
    290                 295                 300

Glu His Asp Met Thr Pro Glu Asn Phe Lys Cys Lys Asp Leu Arg Glu
305                 310                 315                 320

Asn Tyr Cys Arg Asn Pro Asp Gly Ser Glu Ser Pro Trp Cys Phe Thr
                325                 330                 335

Thr Asp Pro Asn Ile Arg Val Gly Tyr Cys Ser Gln Ile Pro Asn Cys
            340                 345                 350

Asp Met Ser His Gly Gln Asp Cys Tyr Arg Gly Asn Gly Lys Asn Tyr
        355                 360                 365

Met Gly Asn Leu Ser Gln Thr Arg Ser Gly Leu Thr Cys Ser Met Trp
    370                 375                 380

Asp Lys Asn Met Glu Asp Leu His Arg His Ile Phe Trp Glu Pro Asp
385                 390                 395                 400

Ala Ser Lys Leu Asn Glu Asn Tyr Cys Arg Asn Pro Asp Asp Asp Ala
                405                 410                 415

His Gly Pro Trp Cys Tyr Thr Gly Asn Pro Leu Ile Pro Trp Asp Tyr
            420                 425                 430

Cys Pro Ile Ser Arg Cys Glu Gly Asp Thr Thr Pro Thr Ile Val Asn
        435                 440                 445

Leu Asp His Pro Val Ile Ser Cys Ala Lys Thr Lys Gln Leu Arg
    450                 455                 460



<210> 3 

<211> 1350 

<212> DNA 

<213> Artificial 

<220> 

<223> dna coding for NK4 
<220> 

<221> CDS 

<222> (1)..(1350)

<400> 3 

atg tct cgt aaa cgt cgt aat act att cat gaa ttc aaa aaa tca gca  48
Met Ser Arg Lys Arg Arg Asn Thr Ile His Glu Phe Lys Lys Ser Ala
1               5                   10                  15

aag act acc cta atc aaa ata gat cca gca ctg aag ata aaa acc aaa  96
Lys Thr Thr Leu Ile Lys Ile Asp Pro Ala Leu Lys Ile Lys Thr Lys
            20                  25                  30

aaa gtg aat act gca gac caa tgt gct aat aga tgt act agg aat aaa  144
Lys Val Asn Thr Ala Asp Gln Cys Ala Asn Arg Cys Thr Arg Asn Lys
        35                  40                  45

gga ctt cca ttc act tgc aag gct ttt gtt ttt gat aaa gca aga aaa  192
Gly Leu Pro Phe Thr Cys Lys Ala Phe Val Phe Asp Lys Ala Arg Lys
    50                  55                  60

caa tgc ctc tgg ttc ccc ttc aat agc atg tca agt gga gtg aaa aaa  240
Gln Cys Leu Trp Phe Pro Phe Asn Ser Met Ser Ser Gly Val Lys Lys
65                  70                  75                  80

gaa ttt ggc cat gaa ttt gac ctc tat gaa aac aaa gac tac att aga  288
Glu Phe Gly His Glu Phe Asp Leu Tyr Glu Asn Lys Asp Tyr Ile Arg
                85                  90                  95

aac tgc atc att ggt aaa gga cgc agc tac aag gga aca gta tct atc  336
Asn Cys Ile Ile Gly Lys Gly Arg Ser Tyr Lys Gly Thr Val Ser Ile
            100                 105                 110

act aag agt ggc atc aaa tgt cag ccc tgg agt tcc atg ata cca cac  384
Thr Lys Ser Gly Ile Lys Cys Gln Pro Trp Ser Ser Met Ile Pro His
        115                 120                 125

gaa cac agc ttt ttg cct tcg agc tat cgg ggt aaa gac cta cag gaa  432
Glu His Ser Phe Leu Pro Ser Ser Tyr Arg Gly Lys Asp Leu Gln Glu
    130                 135                 140

aac tac tgt cga aat cct cga ggg gaa gaa ggg gga ccc tgg tgt ttc  480
Asn Tyr Cys Arg Asn Pro Arg Gly Glu Glu Gly Gly Pro Trp Cys Phe
145                 150                 155                 160

aca agc aat cca gag gta cgc tac gaa gtc tgt gac att cct cag tgt  528
Thr Ser Asn Pro Glu Val Arg Tyr Glu Val Cys Asp Ile Pro Gln Cys
                165                 170                 175

tca gaa gtt gaa tgc atg acc tgc aat ggg gag agt tat cga ggt ctc  576
Ser Glu Val Glu Cys Met Thr Cys Asn Gly Glu Ser Tyr Arg Gly Leu
            180                 185                 190

atg gat cat aca gaa tca ggc aag att tgt cag cgc tgg gat cat cag  624
Met Asp His Thr Glu Ser Gly Lys Ile Cys Gln Arg Trp Asp His Gln
        195                 200                 205

aca cca cac cgg cac aaa ttc ttg cct gaa aga tat ccc gac aag ggc  672
Thr Pro His Arg His Lys Phe Leu Pro Glu Arg Tyr Pro Asp Lys Gly
    210                 215                 220

ttt gat gat aat tat tgc cgc aat ccc gat ggc cag ccg agg cca tgg  720
Phe Asp Asp Asn Tyr Cys Arg Asn Pro Asp Gly Gln Pro Arg Pro Trp
225                 230                 235                 240

tgc tat act ctt gac cct cac acc cgc tgg gag tac tgt gca att aaa  768
Cys Tyr Thr Leu Asp Pro His Thr Arg Trp Glu Tyr Cys Ala Ile Lys
                245                 250                 255

aca tgc gct gac aat act atg aat gac act gat gtt cct ttg gaa aca  816
Thr Cys Ala Asp Asn Thr Met Asn Asp Thr Asp Val Pro Leu Glu Thr
            260                 265                 270

act gaa tgc atc caa ggt caa gga gaa ggc tac agg ggc act gtc aat  864
Thr Glu Cys Ile Gln Gly Gln Gly Glu Gly Tyr Arg Gly Thr Val Asn
        275                 280                 285

acc att tgg aat gga att cca tgt cag cgt tgg gat tct cag tat cct  912
Thr Ile Trp Asn Gly Ile Pro Cys Gln Arg Trp Asp Ser Gln Tyr Pro
    290                 295                 300

cac gag cat gac atg act cct gaa aat ttc aag tgc aag gac cta cga  960
His Glu His Asp Met Thr Pro Glu Asn Phe Lys Cys Lys Asp Leu Arg
305                 310                 315                 320

gaa aat tac tgc cga aat cca gat ggg tct gaa tca ccc tgg tgt ttt  1008
Glu Asn Tyr Cys Arg Asn Pro Asp Gly Ser Glu Ser Pro Trp Cys Phe
                325                 330                 335

acc act gat cca aac atc cga gtt ggc tac tgc tcc caa att cca aac  1056
Thr Thr Asp Pro Asn Ile Arg Val Gly Tyr Cys Ser Gln Ile Pro Asn
            340                 345                 350

tgt gat atg tca cat gga caa gat tgt tat cgt ggg aat ggc aaa aat  1104
Cys Asp Met Ser His Gly Gln Asp Cys Tyr Arg Gly Asn Gly Lys Asn
        355                 360                 365

tat atg ggc aac tta tcc caa aca aga tct gga cta aca tgt tca atg  1152
Tyr Met Gly Asn Leu Ser Gln Thr Arg Ser Gly Leu Thr Cys Ser Met
    370                 375                 380

tgg gac aag aac atg gaa gac tta cat cgt cat atc ttc tgg gaa cca  1200
Trp Asp Lys Asn Met Glu Asp Leu His Arg His Ile Phe Trp Glu Pro
385                 390                 395                 400

gat gca agt aag ctg aat gag aat tac tgc cga aat cca gat gat gat  1248
Asp Ala Ser Lys Leu Asn Glu Asn Tyr Cys Arg Asn Pro Asp Asp Asp
                405                 410                 415

gct cat gga ccc tgg tgc tac acg gga aat cca ctc att cct tgg gat  1296
Ala His Gly Pro Trp Cys Tyr Thr Gly Asn Pro Leu Ile Pro Trp Asp
            420                 425                 430

tat tgc cct att tct cgt tgt gaa ggt gat acc aca cct aca atc gtt  1344
Tyr Cys Pro Ile Ser Arg Cys Glu Gly Asp Thr Thr Pro Thr Ile Val
        435                 440                 445

taa tag  1350

<210> 4
<211> 448
<212> PRT
<213> Artificial

<220>
<223> protein sequence of NK4

<400> 4

Met Ser Arg Lys Arg Arg Asn Thr Ile His Glu Phe Lys Lys Ser Ala
1               5                   10                  15

Lys Thr Thr Leu Ile Lys Ile Asp Pro Ala Leu Lys Ile Lys Thr Lys
            20                  25                  30

Lys Val Asn Thr Ala Asp Gln Cys Ala Asn Arg Cys Thr Arg Asn Lys
        35                  40                  45

Gly Leu Pro Phe Thr Cys Lys Ala Phe Val Phe Asp Lys Ala Arg Lys
    50                  55                  60

Gln Cys Leu Trp Phe Pro Phe Asn Ser Met Ser Ser Gly Val Lys Lys
65                  70                  75                  80

Glu Phe Gly His Glu Phe Asp Leu Tyr Glu Asn Lys Asp Tyr Ile Arg
                85                  90                  95

Asn Cys Ile Ile Gly Lys Gly Arg Ser Tyr Lys Gly Thr Val Ser Ile
            100                 105                 110

Thr Lys Ser Gly Ile Lys Cys Gln Pro Trp Ser Ser Met Ile Pro His
        115                 120                 125

Glu His Ser Phe Leu Pro Ser Ser Tyr Arg Gly Lys Asp Leu Gln Glu
    130                 135                 140

Asn Tyr Cys Arg Asn Pro Arg Gly Glu Glu Gly Gly Pro Trp Cys Phe
145                 150                 155                 160

Thr Ser Asn Pro Glu Val Arg Tyr Glu Val Cys Asp Ile Pro Gln Cys
                165                 170                 175

Ser Glu Val Glu Cys Met Thr Cys Asn Gly Glu Ser Tyr Arg Gly Leu
            180                 185                 190

Met Asp His Thr Glu Ser Gly Lys Ile Cys Gln Arg Trp Asp His Gln
        195                 200                 205

Thr Pro His Arg His Lys Phe Leu Pro Glu Arg Tyr Pro Asp Lys Gly
    210                 215                 220

Phe Asp Asp Asn Tyr Cys Arg Asn Pro Asp Gly Gln Pro Arg Pro Trp
225                 230                 235                 240

Cys Tyr Thr Leu Asp Pro His Thr Arg Trp Glu Tyr Cys Ala Ile Lys
                245                 250                 255

Thr Cys Ala Asp Asn Thr Met Asn Asp Thr Asp Val Pro Leu Glu Thr
            260                 265                 270

Thr Glu Cys Ile Gln Gly Gln Gly Glu Gly Tyr Arg Gly Thr Val Asn
        275                 280                 285

Thr Ile Trp Asn Gly Ile Pro Cys Gln Arg Trp Asp Ser Gln Tyr Pro
    290                 295                 300

His Glu His Asp Met Thr Pro Glu Asn Phe Lys Cys Lys Asp Leu Arg
305                 310                 315                 320

Glu Asn Tyr Cys Arg Asn Pro Asp Gly Ser Glu Ser Pro Trp Cys Phe
                325                 330                 335

Thr Thr Asp Pro Asn Ile Arg Val Gly Tyr Cys Ser Gln Ile Pro Asn
            340                 345                 350

Cys Asp Met Ser His Gly Gln Asp Cys Tyr Arg Gly Asn Gly Lys Asn
        355                 360                 365

Tyr Met Gly Asn Leu Ser Gln Thr Arg Ser Gly Leu Thr Cys Ser Met
    370                 375                 380

Trp Asp Lys Asn Met Glu Asp Leu His Arg His Ile Phe Trp Glu Pro
385                 390                 395                 400

Asp Ala Ser Lys Leu Asn Glu Asn Tyr Cys Arg Asn Pro Asp Asp Asp
                405                 410                 415

Ala His Gly Pro Trp Cys Tyr Thr Gly Asn Pro Leu Ile Pro Trp Asp
            420                 425                 430

Tyr Cys Pro Ile Ser Arg Cys Glu Gly Asp Thr Thr Pro Thr Ile Val
        435                 440                 445